    Process membership in asynchronous environments

    The development of reliable distributed software is simplified by the ability to assume a fail-stop failure model. The emulation of such a model in an asynchronous distributed environment is discussed. The solution proposed, called Strong-GMP, can be supported through a highly efficient protocol, and was implemented as part of a distributed systems software project at Cornell University. The precise definition of the problem, the protocol, correctness proofs, and an analysis of costs are addressed

    Practical Utility of Knowledge-Based Analyses: Optimizations and Optimality for an Implementation of Asynchronous, Fail-Stop Processes (Extended Abstract).

    The Group Membership Problem is concerned with propagating changes in the membership of a group of processes to the members of that group. A restricted version of this problem allows one to implement a fail-stop failure model of processes in an asynchronous environment assuming a crash failure model. While the ISIS Toolkit relies on this for its Failure Detector, the current specification of GMP sheds no light on how to implement it. We present a knowledge-based formulation, cast as a commit-style problem, that is not only easier to understand, but also makes clear where optimizations to the ISIS implementation are and are not possible. In addition, the epistemic formulation allows us to use the elegant results of knowledge-acquisition theory to discover a lower bound on the required number of messages, construct a minimal protocol, and discuss the tradeoffs between the message-minimal protocol and the optimized ISIS implementation

    Completeness of a Temporal Logic for Asynchronous Systems

    In this paper, we define a variant of temporal logic that is designed to capture the temporal and causal aspects of asynchronous distributed systems. In these systems, the usual physical concept of time based upon the notion of a global clock is relegated to a secondary role; causal dependency or necessary temporal precedence is fundamental. Causal dependence is just temporal order locally; globally it is based on communication. An instant of time is replaced by a consistent cut. The semantics of most temporal logics have been based on computation sequences, either linear or branching time. In these models, one views an execution of a system as a single sequence of events. In such models, a "spurious" linearization is introduced, effectively disregarding concurrency. Recently, however, partially ordered models have been considered because, it is argued, they more accurately represent the system being studied. The main technical contribution of this paper is to show that such a logic is complete for the class of models defined by executions of asynchronous systems
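
    The notion of a consistent cut mentioned in this abstract can be illustrated with a small sketch. It assumes events are timestamped with vector clocks, a standard device that the abstract itself does not mention; the function name and the example data are hypothetical. A cut is consistent when no process inside it has observed an event of another process that the cut leaves out.

```python
from typing import List

def is_consistent_cut(frontier: List[List[int]]) -> bool:
    """frontier[i] is the vector clock of the last event of process i inside the cut.

    Assumption: vector clocks are used to track causality (not stated in the abstract).
    """
    n = len(frontier)
    for i in range(n):
        for j in range(n):
            # Process j must not have seen more events of process i
            # than the cut itself includes for process i.
            if frontier[j][i] > frontier[i][i]:
                return False
    return True

# Two processes: p0 sends a message at its 2nd event, p1 receives it at its 1st event.
includes_send_and_receive = [[2, 0], [2, 1]]
includes_receive_only = [[1, 0], [2, 1]]  # the receive is in the cut, the send is not

print(is_consistent_cut(includes_send_and_receive))
print(is_consistent_cut(includes_receive_only))
```

    The second cut fails the check because it contains the receipt of a message whose sending lies outside the cut, which is the situation a global "instant" must never exhibit.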

    The Group Membership Problem in Asynchronous Systems

    The thesis formally defines the class of Process Group Membership Problems (GMP) for asynchronous systems. These problems involve maintaining a list of processes belonging to the system, and updating it as processes join (are started) and leave (terminate or fail). We investigate closely the strongest member of the GMP class. Strong GMP presents this list in a consistent manner to all processes using it: the sequence of joins and leaves is identical. We show that, despite prevalent beliefs, strong consistency and efficiency are not conflicting goals. This should have significant implications for distributed systems since the need for process membership agreement arises in many canonical problems in distributed computing. We present an inexpensive means (the S-GMP algorithm) of assuring complete, system-wide agreement on process membership. We discuss the role of process membership in distributed systems and how to use S-GMP to build a Membership Resource Manager (MRM). The thesis also examines whether any weaker member of the GMP class suffices to specify an MRM. In doing so we justify using Strong GMP over two much weaker GMP instances in three important ways. First, by comparing Strong GMP and its minimal solution with the weaker instances and their minimal solutions, we arrive at the surprising result that Strong GMP is often less expensive than the others, notably in executions in which membership changes are frequent. Second, we show that a membership service defined by Strong GMP is more robust, more responsive and more adaptable than a membership service defined by weaker GMP instances. Third, we compare membership services defined by the various GMPs according to the utility each service provides to higher-level distributed applications. That is, ignoring implementation costs, how useful are the different GMP consistency guarantees as a platform on which to build distributed solutions to distributed problems? We show that the consistency guarantees of Strong GMP make it (and the membership service it defines) more useful to higher-level distributed applications. Finally, the thesis presents experimental results from implementing S-GMP. The data demonstrate that a centralized Membership Resource Manager is a non-intrusive service around which to design distributed systems and provide system-wide consistency. The data quantify and help clarify the tradeoffs between replication degree, overall system size and process failure frequency. These initial results should guide future MRM design and development
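
    The consistency property described in this abstract (every process sees the same sequence of membership changes) can be made concrete with a short check. This is only a sketch of the property, not the S-GMP algorithm; the representation of a history as a mapping from view number to member set, and all names, are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List

View = FrozenSet[str]

def views_agree(histories: List[Dict[int, View]]) -> bool:
    """True if no two processes installed different memberships under the same view number.

    Assumption: each process records its local views keyed by a view number.
    """
    agreed: Dict[int, View] = {}
    for history in histories:
        for version, members in history.items():
            # The first membership recorded for a version is the reference;
            # any later mismatch breaks agreement.
            if agreed.setdefault(version, members) != members:
                return False
    return True

p = {1: frozenset({"p", "q", "r"}), 2: frozenset({"p", "q"})}
q = {1: frozenset({"p", "q", "r"}), 2: frozenset({"p", "q"})}
r_diverged = {1: frozenset({"p", "q", "r"}), 2: frozenset({"q", "r"})}

print(views_agree([p, q]))
print(views_agree([p, q, r_diverged]))
```

    A process that lags behind and has installed fewer views still passes the check, since only view numbers present in both histories are compared.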

    Using Process Groups to Implement Failure Detection in Asynchronous Environments

    Agreement on the membership of a group of processes in a distributed system is a basic problem that arises in a wide range of applications. Such groups occur when a set of processes co-operate to perform some task, share memory, monitor one another, subdivide a computation, and so forth. In this paper we discuss the Group Membership Problem as it relates to failure detection in asynchronous, distributed systems. We present a rigorous, formal specification for group membership under this interpretation. We then present a solution for this problem that improves upon previous work

    A New Model for Availability in the Face of Self-Propagating Attacks

    The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force or the U.S. Government

    Understanding Partitions and the "No Partition" Assumption

    The paper discusses partitions in asynchronous message-passing systems. In such systems slow processes and slow links can lead to virtual partitions that are indistinguishable from real ones. This raises the following question: what is a "partition" in an asynchronous system? To overcome the impossibility of detecting crashed processes in an asynchronous system, our system model incorporates a failure suspector to detect (possibly erroneously) process failures. Based on failure suspicions we give a definition of partitions that accounts for real partitions as well as virtual ones. We show that under certain assumptions about the process behavior, any incorrect failure suspicion inevitably partitions the system. We then show how to interpret the "absence of partition" assumption
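
    A failure suspector of the kind this abstract refers to can be sketched as follows. The sketch assumes a timeout on the last message heard from each process, which is one common way to build a suspector and is not stated in the abstract; the function name, timestamps, and timeout value are hypothetical.

```python
from typing import Dict, Set

def suspects(last_heard: Dict[str, float], now: float, timeout: float) -> Set[str]:
    """Return the processes not heard from within `timeout` time units.

    Assumption: suspicion is driven purely by elapsed time since the last message.
    """
    return {p for p, t in last_heard.items() if now - t > timeout}

# q has not crashed; it is merely slow, so its last message is old.
last_heard = {"p": 9.5, "q": 4.0, "r": 9.0}
print(suspects(last_heard, now=10.0, timeout=3.0))
```

    Here q is suspected even though it is alive, which is the kind of incorrect suspicion the paper treats as a virtual partition: from the observer's side it cannot be told apart from a crash or a broken link.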